
101.

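A Synthetic Speech Detection System Based on a Transformer Encoder

万伊 杨飞然 杨军 《应用声学》2023,42(1):26-33

Automatic speaker verification (ASV) systems are a common way to authenticate a target speaker's identity, but they are vulnerable to attacks that use synthetic speech; synthetic speech detection systems aim to address this problem. This paper proposes a synthetic speech detection method based on a Transformer encoder, which uses the self-attention mechanism to learn long-term dependencies within the input features. Because synthetic speech detection does not depend on the abstract semantic content of an utterance, a model with few parameters can still detect synthetic speech well. The paper evaluates four commonly used synthetic speech detection features with the Transformer encoder. On the logical access dataset of the ASVspoof2019 challenge, the system based on linear frequency cepstral coefficient (LFCC) features and the Transformer encoder achieves an equal error rate (EER) of 3.13% and a tandem detection cost function (t-DCF) of 0.0708 with only 0.082 M model parameters.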
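
The equal error rate quoted in this abstract is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how such a figure can be computed from detector scores; it is not the evaluation code of the paper or of the ASVspoof toolkit. The function name `equal_error_rate` and the convention that a higher score means "more likely bona fide" are assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold). Assumes a higher score means 'more likely bona fide'."""
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection: a bona fide utterance scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: a spoofed utterance scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical scores, for illustration only.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite score set the two error rates rarely cross exactly, so the sketch reports their average at the closest threshold; evaluation toolkits usually interpolate instead.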

102.

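A Clustering-Based Gated Convolutional Network Method for Speech Separation

罗宇 胡维平 吴华楠 《应用声学》2023,42(5):1099-1105

Speech separation methods based on deep clustering have been shown to solve the speaker output label permutation problem in mixed speech effectively. However, most existing clustering-based speaker separation methods optimize the embeddings so as to minimize the reconstruction error of each source. Taking the time-domain convolutional network (ConvTasNet) as the base network, this paper designs an improved clustering-based gated convolution (Gate-conv Cluster) speech separation method that performs end-to-end deep-clustering source separation in the time domain through stacked gated convolutional networks. The framework applies nonlinear gated activations in the time-domain convolutional network to extract deep features of the speech signal. At the same time, it clusters in a high-dimensional feature space to represent and partition the features of the speech signal, which provides long-term speaker representation information for recovering the different sources. The framework solves the speaker output label permutation problem and models the long-term dependencies of speech signals. Experiments on the Wall Street Journal dataset show that the method reaches 16.72 dB in SDRi (signal-to-distortion ratio improvement) and 16.33 dB in Si-SNR (scale-invariant signal-to-noise ratio).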
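
For readers unfamiliar with the Si-SNR figure reported here, the sketch below shows the commonly used definition: the estimate is projected onto the reference, and the energy of that projection is compared with the energy of the residual. This is an illustrative implementation, not code from the paper; the function name `si_snr` and the small constant `eps` are assumptions.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    n = len(target)
    # Remove the mean so that a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target to obtain the "signal" component.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    s_target = [dot / tgt_energy * t for t in tgt]
    # Whatever is left after the projection counts as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]
    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(x * x for x in e_noise) + eps
    return 10 * math.log10(signal_energy / noise_energy + eps)


# Hypothetical reference and a slightly noisy estimate, for illustration only.
reference = [math.sin(0.1 * i) for i in range(200)]
noisy = [r + 0.1 * math.cos(0.37 * i) for i, r in enumerate(reference)]
print(round(si_snr(noisy, reference), 2), "dB")
```

Because of the projection step, rescaling the estimate leaves the score essentially unchanged, which is the property the "scale-invariant" name refers to.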

103.

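A Telephone Voice Mail System Based on the Linux Platform  Total citations: 1, self-citations: 0, citations by others: 1

周建国 臧亮 杨皓 周万江 晏蒲柳 《武汉大学学报(理学版)》2002,48(1):84-88

Using CTI (computer telephony integration) technology to combine telephone voice with e-mail, this paper proposes and implements a telephone voice mail system based on the Linux platform, and discusses its principles and the implementation of each functional module. The system connects the Internet with the now widespread telecommunications network, providing a comprehensive cross-network communication platform and broadening the application areas of e-mail.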

104.

Time-compressed speech intelligibility in different reverberant conditions

Jędrzej Kociński Dawid Niemiec 《Applied Acoustics》2016


105.

Intelligibility of Tracheoesophageal Speech in Noise

Douglas A. McColl 《Journal of voice》2006,20(4):605-615


106.

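The Reassignment Method for Time-Frequency Analysis of Non-Stationary Signals and Its Application in Speech Signal Processing  Total citations: 6, self-citations: 0, citations by others: 6

魏庆国 吴建华 《南昌大学学报(理科版)》2004,28(2):174-177

This paper introduces the reassignment method in joint time-frequency analysis of non-stationary signals. The key idea of the method is to move each value of the nonlinear convolution that represents the local energy distribution of the signal from the geometric center of the convolution kernel to its center of mass, which improves the readability of the time-frequency representation. Several examples illustrate the effect of the reassignment method. Finally, the application of the reassignment method to speech signal processing is described.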
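
The "geometric center to center of mass" step can be written down compactly. The formulation below follows the form commonly given in the time-frequency literature for distributions obtained by smoothing the Wigner-Ville distribution $W_x$ with a kernel $\phi$; the symbols are assumptions chosen for illustration, since the abstract itself gives no notation.

```latex
\begin{aligned}
\mathrm{TFR}_x(t,\omega) &= \frac{1}{2\pi}\iint \phi(u,\Omega)\,W_x(t-u,\,\omega-\Omega)\,du\,d\Omega,\\[4pt]
\hat{t}(t,\omega) &= t-\frac{\iint u\,\phi(u,\Omega)\,W_x(t-u,\,\omega-\Omega)\,du\,d\Omega}
                              {\iint \phi(u,\Omega)\,W_x(t-u,\,\omega-\Omega)\,du\,d\Omega},\\[4pt]
\hat{\omega}(t,\omega) &= \omega-\frac{\iint \Omega\,\phi(u,\Omega)\,W_x(t-u,\,\omega-\Omega)\,du\,d\Omega}
                                     {\iint \phi(u,\Omega)\,W_x(t-u,\,\omega-\Omega)\,du\,d\Omega},\\[4pt]
\mathrm{RTFR}_x(t',\omega') &= \iint \mathrm{TFR}_x(t,\omega)\,
      \delta\bigl(t'-\hat{t}(t,\omega)\bigr)\,\delta\bigl(\omega'-\hat{\omega}(t,\omega)\bigr)\,dt\,d\omega .
\end{aligned}
```

The first line is the smoothed distribution evaluated at the geometric center $(t,\omega)$ of the kernel; the second and third lines give the center of mass of the energy under the kernel; the last line moves each value to that center of mass.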

107.

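An Adaptive Speech Information Hiding Algorithm Based on Fractal Dimension  Total citations: 2, self-citations: 0, citations by others: 2

陈力 谢玉琼 《武汉大学学报(理学版)》2003,49(3):313-317

To improve the reliability and security of secret data transmission, this paper proposes an adaptive speech information hiding method that is based on computing the fractal dimension of speech segments and does not require the original speech signal for detection. The position and amount of secret information embedded in each speech segment are determined by the perceptual characteristics of human hearing and by the fractal dimension, which balances robustness and hiding capacity while preserving imperceptibility. Simulation experiments show that the algorithm has good information hiding performance.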
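
The abstract does not say which fractal dimension estimator the authors use. Purely as an illustration of what a per-segment fractal dimension looks like, the sketch below computes the Katz fractal dimension, treating the segment as the planar curve of points (i, x_i). The choice of Katz's estimator and the function name are assumptions, not the method of the paper.

```python
import math


def katz_fractal_dimension(samples):
    """Katz fractal dimension of a waveform treated as the planar curve (i, x_i)."""
    points = list(enumerate(samples))
    steps = len(points) - 1
    if steps < 2:
        raise ValueError("need at least three samples")
    # L: total curve length, summed over successive points.
    length = sum(math.dist(points[i], points[i + 1]) for i in range(steps))
    # d: largest distance from the first point to any other point.
    extent = max(math.dist(points[0], p) for p in points[1:])
    return math.log10(steps) / (math.log10(steps) + math.log10(extent / length))


# A straight line has dimension 1; a rapidly alternating segment scores higher.
print(katz_fractal_dimension([float(i) for i in range(100)]))
print(katz_fractal_dimension([float(i % 2) for i in range(100)]))
```

An embedding scheme of the kind described could then use such a value, together with a perceptual model, to decide how much data a segment can carry; how the paper maps one to the other is not stated in the abstract.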

108.

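Spectral Variation of Clipped Speech Signals

郑义 张礼和 《浙江大学学报(理学版)》1986,13(4):444-449

This paper analyzes the variation in the short-time FFT spectrum caused by symmetric clipping of speech signals. It computes the spectral correlation coefficients of the same speech segment under different degrees of clipping, as well as those of different phonemes under the same clipping. The results provide a basis for developing practical speech recognition systems.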
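
The measurement described in this abstract can be sketched in a few lines: clip a frame symmetrically, take the magnitude spectrum of the original and clipped frames, and correlate the two. The code below is an illustration under stated assumptions, not the procedure of the paper: it uses a direct DFT (which yields the same values as an FFT), applies no analysis window, uses the Pearson coefficient as the "spectral correlation coefficient", and all function names are hypothetical.

```python
import cmath
import math


def clip(signal, level):
    """Symmetric hard clipping to the range [-level, +level]."""
    return [max(-level, min(level, x)) for x in signal]


def magnitude_spectrum(frame):
    """DFT magnitudes for bins 0..N/2, evaluated directly in O(N^2)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spectral_correlation_after_clipping(frame, level):
    """Correlate the spectrum of a frame with the spectrum of its clipped copy."""
    return correlation(magnitude_spectrum(frame), magnitude_spectrum(clip(frame, level)))


# Hypothetical two-tone frame standing in for a voiced speech segment.
frame = [math.sin(2 * math.pi * 4 * i / 64) + 0.5 * math.sin(2 * math.pi * 9 * i / 64)
         for i in range(64)]
for level in (2.0, 0.8, 0.2):
    print(level, round(spectral_correlation_after_clipping(frame, level), 4))
```

A clipping level above the peak amplitude leaves the frame untouched, so the coefficient is 1; lower levels introduce harmonic distortion and pull it below 1.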

109.

Acoustic and Perceptual Analyses of Brazilian Male Actors' and Nonactors' Voices: Long-term Average Spectrum and the “Actor's Formant”

Suely Master Noemi De Biase Brasília Maria Chiari Anne-Maria Laukkanen 《Journal of voice》2008,22(2):146-154

SUMMARY: This study investigates the possible differences between actors' and nonactors' vocal projection strategies using acoustic and perceptual analyses. A total of 11 male actors and 10 male nonactors volunteered as subjects, reading an extended text sample in habitual, moderate, and loud levels. The samples were analyzed for sound pressure level (SPL), alpha ratio (difference between the average SPL of the 1-5kHz region and the average SPL of the 50Hz-1kHz region), fundamental frequency (F0), and long-term average spectrum (LTAS). Through LTAS, the mean frequency of the first formant (F1) range, the mean frequency of the "actor's formant," the level differences between the F1 frequency region and the F0 region (L1-L0), and the level differences between the strongest peak at 0-1kHz and that at 3-4kHz were measured. Eight voice specialists evaluated perceptually the degree of projection, loudness, and tension in the samples. The actors had a greater alpha ratio, stronger level of the "actor's formant" range, and a higher degree of perceived projection and loudness in all loudness levels. SPL, however, did not differ significantly between the actors and nonactors, and no differences were found in the mean formant frequencies ranges. The alpha ratio and the relative level of the "actor's formant" range seemed to be related to the degree of perceived loudness. From the physiological point of view, a more favorable glottal setting, providing a higher glottal closing speed, may be characteristic of these actors' projected voices. So, the projected voices, in this group of actors, were more related to the glottic source than to the resonance of the vocal tract.

110.

A Robust, Real-Time Voice Activity Detection Algorithm for Embedded Mobile Devices  Total citations: 1, self-citations: 0, citations by others: 1

Bian Wu Xiaolin Ren Chongqing Liu Yaxin Zhang 《Journal of Sol-Gel Science and Technology》1997,8(2):133-146

When an Automatic Speech Recognition (ASR) system is applied in noisy environments, Voice Activity Detection (VAD) is crucial to the performance of the overall system. The employment of the VAD for ASR on embedded mobile systems will minimize physical distractions and make the system convenient to use. Conventional VAD algorithms are either of high complexity, which makes them unsuitable for embedded mobile devices, or of low robustness, which holds back their application in noisy mobile environments. In this paper, we propose a robust VAD algorithm specifically designed for ASR on embedded mobile devices. The architecture of the proposed algorithm is based on a two-level decision making strategy, where there is an interaction between a lower features-based level and subsequent decision logic based on a finite-state machine. Many discriminating features are employed in the lower level to improve the robustness of the VAD. The two-level decision strategy allows different features to be used in different states and reduces the cost of the algorithm, which makes the proposed algorithm suitable for embedded mobile devices. The evaluation experiments show that the proposed VAD algorithm is robust and contributes to the overall performance gain of the ASR system in various acoustic environments.
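
The two-level structure described in this abstract (per-frame features below, a finite-state machine above) can be illustrated with a deliberately small sketch. It is not the authors' algorithm: it uses a single feature (frame energy), two states, and hypothetical thresholds and frame counts, all of which are assumptions chosen to keep the example short.

```python
def frame_energy(frame):
    """Lower level: mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def vad(frames, enter_thresh=0.01, exit_thresh=0.002, onset_frames=2, hangover_frames=3):
    """Return one boolean per frame: True where the frame is labelled speech.

    All thresholds and counts are hypothetical values chosen for illustration.
    """
    state = "SILENCE"
    run = 0  # consecutive frames that argue for leaving the current state
    labels = []
    for frame in frames:
        energy = frame_energy(frame)
        if state == "SILENCE":
            # Upper level: require several loud frames before declaring speech.
            run = run + 1 if energy > enter_thresh else 0
            if run >= onset_frames:
                state, run = "SPEECH", 0
        else:
            # In SPEECH a lower threshold applies, followed by a hangover period.
            run = run + 1 if energy < exit_thresh else 0
            if run >= hangover_frames:
                state, run = "SILENCE", 0
        labels.append(state == "SPEECH")
    return labels


quiet, loud = [0.0] * 4, [0.5] * 4
print(vad([quiet, quiet, loud, loud, loud, quiet, quiet, quiet, quiet]))
```

Keeping the decision logic in the state machine is what lets each state consult a different test, here a different energy threshold; a fuller system would presumably swap in different features per state, as the abstract describes.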
